Digital image encoding and decoding method and digital image encoding and decoding device using the same

ABSTRACT

The present invention provides an encoder and a decoder of digital picture data, and the encoder/decoder can realize high precision transform with less quantity of transferred data, when a parameter of the digital picture data is not an integer but has numbers of digits, to which the Affine transformation can be applicable. The encoder/decoder comprises the following elements: 
     (a) picture compression means for encoding an input picture and compressing the data, 
     (b) coordinates transform means for outputting coordinate data which is obtained by decoding the compressed data and transforming the decoded data into a coordinate system, 
     (c) transformation parameter producing means for producing transformation parameters from the coordinates data, 
     (d) predicted picture producing means for producing predicted picture from the input picture by the transformation parameter, and 
     (e) transmission means for transmitting the compressed picture and the coordinates data.

This Application is a U.S. National Phase Application of PCT International Application PCT/JP97/00118.

FIELD OF THE ENGINEERING

The present invention relates to methods of encoding and decoding digital picture data for storage or transmission, more specifically, a method of encoding and decoding the motion information used to produce predicted pictures, a method of producing an accurate predicted picture, and an apparatus using these methods.

BACKGROUND ART

Data compression (=encoding) is required for efficient storing and transmitting of a digital picture.

Several methods of encoding are available as prior art, such as “discrete cosine transform” (DCT) based methods including JPEG and MPEG, and other wave-form encoding methods such as “subband”, “wavelet”, “fractal” and the like. Further, in order to remove redundant signals between pictures, a prediction method between pictures is employed, and then the differential signal is encoded by a wave-form encoding method.

A method of MPEG based on DCT using motion compensation is described here. First, resolve an input picture into macro blocks of 16×16 pixels. One macro block is further resolved into blocks of 8×8, and the blocks of 8×8 undergo DCT and then are quantized. This process is called “Intra-frame coding.” Motion detection means including a block matching method detects, from a frame which adjoins in time sequence, a prediction macro block having the least error with respect to a target macro block. Based on the detected motion, an optimal predicted block is obtained by performing motion compensation of the previous pictures. A signal indicating the predicted macro block having the least error is a motion vector. Next, a difference between the target block and its corresponding predicted block is found, then the difference undergoes DCT, and the obtained DCT coefficients are quantized and then transmitted or stored together with the motion information. This process is called “Inter-frame coding.”
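The block matching step described above can be illustrated with a minimal sketch. The function names `sad` and `block_match`, the use of the sum of absolute differences as the error measure, the exhaustive search window, and the sign convention of the returned vector (displacement from the target block position to the matching position in the reference picture) are all assumptions made for illustration; the text above only states that the block having the least error is selected.

```python
def sad(target, reference, tx, ty, rx, ry, size):
    """Sum of absolute differences between a target block and a reference block."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(target[ty + j][tx + i] - reference[ry + j][rx + i])
    return total


def block_match(target, reference, tx, ty, size=16, search=4):
    """Exhaustively search a +/-search window for the least-error displacement."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = tx + dx, ty + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(target, reference, tx, ty, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    cost, dx, dy = best
    return (dx, dy), cost


# Synthetic 12x12 pictures: the target is the reference shifted by (2, 1).
reference = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
target = [
    [reference[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0 for x in range(12)]
    for y in range(12)
]
motion_vector, cost = block_match(target, reference, 4, 4, size=4, search=3)
```

In this synthetic case the search recovers the shift exactly, so the differential block that would be passed to the DCT is all zeros.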

At the data receiving side, first, the quantized DCT coefficients are decoded into the original differential signals, next, a predicted block is restored based on the motion vector, then, the differential signal is added to the predicted block, and finally, the picture is reproduced.

A predicted picture is formed on a block by block basis; however, an entire picture sometimes moves by panning or zooming, and in this case, the entire picture undergoes motion compensation. The motion compensation or predicted picture formation involves not only a simple parallel translation but also other deformations such as enlargement, reduction and rotation.

The following equations (1)-(4) express movement and deformation, where (x, y) represents the coordinates of a pixel, and (u, v) represents the transformed coordinates, which also express a motion vector at (x, y). Other variables are the transformation parameters which indicate a movement or a deformation.

(u, v)=(x+e, y+f)  (1)

(u, v)=(ax+e, dy+f)  (2)

(u, v)=(ax+by+e, cx+dy+f)  (3)

(u, v)=(gx²+pxy+ry²+ax+by+e, hx²+qxy+sy²+cx+dy+f)  (4)
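As a small illustration of how equations (1)-(3) relate, the sketch below evaluates equation (3) and obtains equations (1) and (2) as special cases by fixing some parameters. The function name `affine_transform` and the sample values are hypothetical and chosen only for illustration.

```python
def affine_transform(x, y, a, b, c, d, e, f):
    """Equation (3): (u, v) = (a*x + b*y + e, c*x + d*y + f)."""
    return a * x + b * y + e, c * x + d * y + f


# Equation (1), pure translation: a = d = 1 and b = c = 0.
translated = affine_transform(2, 3, a=1, b=0, c=0, d=1, e=5, f=-1)

# Equation (2), scaling plus translation: b = c = 0.
scaled = affine_transform(2, 3, a=2, b=0, c=0, d=3, e=1, f=1)
```

Under this reading, equation (1) needs two parameters, equation (2) four, and equation (3) six.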

Equation (3) is the so-called Affine transform, and this Affine transform is described here as an example. The parameters of the Affine transform are found through the following steps:

First, resolve a picture into a plurality of blocks, e.g., 2×2, 4×4, 8×8, etc., then find a motion vector of each block through a block matching method. Next, select at least three of the most reliable motion vectors from the detected motion vectors. Substitute these three vectors into equation (3) and solve the six simultaneous equations to find the Affine parameters. In general, errors decrease as the number of selected motion vectors increases, in which case the Affine parameters are found by the least squares method. The Affine parameters thus obtained are utilized to form a predicted picture. The Affine parameters shall be transmitted to the data receiving side for producing the identical predicted picture.
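The parameter estimation described above can be sketched as a least squares fit of equation (3) to N point correspondences (N of at least three, not all on one line). The helper names `solve3` and `fit_affine`, and the choice of normal equations solved by Cramer's rule, are assumptions for illustration; the text only states that the least squares method is used.

```python
def solve3(m, r):
    """Solve the 3x3 linear system m * p = r by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; Affine parameters are not unique")
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = r[i]  # replace column k with the right-hand side
        solution.append(det(mk) / d)
    return solution


def fit_affine(points, transformed):
    """Least squares fit of u = a*x + b*y + e and v = c*x + d*y + f."""
    m = [[0.0] * 3 for _ in range(3)]
    ru = [0.0] * 3
    rv = [0.0] * 3
    for (x, y), (u, v) in zip(points, transformed):
        basis = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += basis[i] * basis[j]
            ru[i] += basis[i] * u
            rv[i] += basis[i] * v
    a, b, e = solve3(m, ru)
    c, d, f = solve3(m, rv)
    return a, b, c, d, e, f


# Four block positions and their positions after a known transform.
true_params = (1.5, 0.25, -0.5, 2.0, 3.0, -4.0)  # a, b, c, d, e, f
points = [(0, 0), (16, 0), (0, 16), (16, 16)]
transformed = [
    (true_params[0] * x + true_params[1] * y + true_params[4],
     true_params[2] * x + true_params[3] * y + true_params[5])
    for x, y in points
]
estimated = fit_affine(points, transformed)
```

With exactly three non-collinear points the same routine reduces to solving the six simultaneous equations directly.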

However, when a conventional inter-frame coding is used, a target picture and a reference picture should be of the same size, and the conventional inter-frame coding method is not well prepared for dealing with pictures of different sizes.

Size variations of two adjoining pictures largely depend on motions of an object in these pictures. For instance, when a person standing with his arms down (FIG. 7A) raises the arms, the size of the rectangle enclosing the person changes (FIG. 7B). When encoding efficiency is considered, the target picture and reference picture should be transformed into the same coordinates space in order to decrease the coded quantity of the motion vectors. Also, the arrangement of macro blocks resolved from a picture varies depending on the picture size variation. For instance, when the image changes from FIG. 7A to FIG. 7B, a macro block 701 is resolved into macro blocks 703 and 704, which are subsequently compressed. Due to this compression, a vertical distortion resulting from the quantization appears on the person's face in the reproduced picture (FIG. 7B), whereby the visual picture quality is degraded.

Because the Affine transform requires high accuracy, the Affine parameters (a, b, c, d, e, f, etc.) are, in general, real numbers having a number of decimal places. A considerable number of bits is needed to transmit the parameters at high accuracy. In a conventional way, the Affine parameters are quantized, and transmitted as fixed length codes or variable length codes, which lowers the accuracy of the parameters, and thus the highly accurate Affine transform cannot be realized. As a result, a desirable predicted picture cannot be produced.

As the equations (1)-(4) express, the number of transformation parameters ranges from 2 to 10 or more. When the transformation parameters are transmitted with a number of bits prepared for the maximum number of parameters, a problem occurs, i.e., redundant bits are transmitted.

DISCLOSURE OF THE INVENTION

The present invention aims to, firstly, provide an encoder and a decoder of digital picture data for transmitting non-integer transformation parameters having many digits, such as those of the Affine transform, at high accuracy with a smaller amount of coded data. In order to achieve the above objective, a predicted picture encoder comprising the following elements is prepared:

(a) picture compression means for encoding an input picture and compressing the data,

(b) coordinates transform means for outputting coordinates data which is obtained by decoding the compressed data and transforming the decoded data into a coordinates system,

(c) transformation parameter producing means for producing transformation parameters from the coordinates data,

(d) predicted picture producing means for producing a predicted picture from the input picture by the transformation parameters, and

(e) transmission means for transmitting the compressed picture and the coordinates data.

Also a digital picture decoder comprising the following elements is prepared:

(f) variable length decoding means for decoding input compressed picture data and input coordinates data,

(g) transformation parameter producing means for producing transformation parameters from the decoded coordinates data,

(h) predicted picture producing means for producing predicted picture data using the transformation parameters,

(i) addition means for producing a decoded picture by adding the predicted picture and the compressed picture data.

To be more specific, the transformation parameter producing means of the above digital encoder and decoder produces the transformation parameters using “N” (a natural number) pixel coordinates points and the corresponding “N” transformed coordinates points obtained by applying a predetermined linear polynomial function to the N pixel coordinates points. Further, the transformation parameter producing means of the above digital encoder and decoder outputs transformation parameters produced through the following steps: first, input target pictures having different sizes and numbered “1” through “N”, second, set common spatial coordinates for the above target pictures, third, compress the target pictures to produce compressed pictures thereof, then, decode the compressed pictures and transform them into the common spatial coordinates, next, produce expanded (decompressed) pictures thereof and store them, and at the same time, transform the expanded pictures into the common spatial coordinates.

The present invention aims to, secondly, provide a digital picture encoder and decoder. To be more specific, when pictures of different sizes are encoded to form a predicted picture, the target picture and reference picture are transformed into the same coordinates space, and the coordinates data thereof is transmitted, thereby increasing the accuracy of motion detection and, at the same time, reducing the coded quantity and improving picture quality.

In order to achieve the above objective, the predicted picture encoder according to the present invention performs the following steps: first, input target pictures having different sizes and numbered “1” through “N”, second, set common spatial coordinates for the above target pictures, third, compress the target pictures to produce compressed pictures thereof, then, decode the compressed pictures and transform them into the common spatial coordinates, next, produce expanded pictures thereof and store them, and at the same time, transform the expanded pictures into the common spatial coordinates, thereby producing a first off-set signal (coordinates data), then encode this off-set signal, and transmit it together with the first compressed picture.

The predicted picture encoder according to the present invention further performs the following steps with regard to the “n”th (n=2, 3, . . . N) target picture after the above steps: first, transform the target picture into the common spatial coordinates, second, produce a predicted picture by referring to an expanded picture of the (n−1)th picture, third, produce a differential picture between the “n”th target picture and the predicted picture, and then compress it to encode, thereby forming the “n”th compressed picture, then, decode the “n”th compressed picture, next, transform it into the common spatial coordinates to produce the “n”th expanded picture, and store it, at the same time, encode the “n”th off-set signal (coordinates data) which is produced by transforming the “n”th target picture into the common spatial coordinates, finally transmit it together with the “n”th compressed picture.

The predicted picture decoder of the present invention comprises the following elements: input terminal, data analyzer (parser), decoder, adder, coordinates transformer, motion compensator and frame memory. The predicted picture decoder of the present invention performs the following steps: first, input compressed picture data to the input terminal, the compressed picture data being numbered from 1 to N including the “n”th off-set signal which is produced by encoding the target pictures having respective different sizes and being numbered 1 to N, and transforming the “n”th (n=1, 2, 3, . . . N) target picture into the common spatial coordinates, second, analyze the first compressed picture data, and output the first compressed picture signal together with the first off-set signal, then input the first compressed picture signal to the decoder to decode it to the first reproduced picture, and then, the first reproduced picture undergoes the coordinates transformer using the first off-set signal, and store the transformed first reproduced picture in the frame memory. With regard to the “n”th (n=2, 3, 4, . . . N) compressed picture data, first, analyze the “n”th compressed picture data in the data analyzer, second, output the “n”th compressed picture signal, the “n”th off-set signal and the “n”th motion signal, third, input the “n”th compressed picture signal to the decoder to decode it into the “n”th expanded differential picture, next, input the “n”th off-set signal and “n”th motion signal to the motion compensator, then, obtain the “n”th predicted picture from the “n−1”th reproduced picture stored in the frame memory based on the “n”th off-set signal and “n”th motion signal, after that, in the adder, add the “n”th expanded differential picture to the “n”th predicted picture to restore them into the “n”th reproduced picture, and at the same time, the “n”th reproduced picture undergoes the coordinates transformer based on the “n”th off-set signal and is stored in the frame memory.

The present invention aims to, thirdly, provide a digital picture encoder and decoder which can accurately transmit the coordinates data including the transformation parameters having the Affine parameter for the Affine transform, and can produce an accurate predicted picture.

A digital picture decoder according to the present invention comprises the following elements: variable length decoder, differential picture expander, adder, transformation parameter generator, predicted picture generator and frame memory.

The above digital picture decoder performs the following steps: first, input data to the variable length decoder, second, separate differential picture data and transmit it to the differential picture expander, at the same time, separate the coordinates data and send it to the transformation parameter generator, thirdly, in the differential picture expander, expand the differential picture data, and transmit it to the adder, next, in the transformation parameter generator, produce the transformation parameters from the coordinates data, and transmit them to the predicted picture generator, then, in the predicted picture generator, produce the predicted picture using the transformation parameters and the picture input from the frame memory, and transmit the predicted picture to the adder, where the predicted picture is added to the expanded differential picture, finally, produce the picture to output, at the same time, store the picture in the frame memory.

The above coordinates data represent one of the following cases:

(a) the coordinates points of N pixels and the corresponding N transformed coordinates points obtained by applying the predetermined linear polynomial function to the coordinates points of the N pixels, or

(b) differential values between the coordinates points of N pixels and the corresponding N transformed coordinates points obtained by applying the predetermined linear polynomial function to the coordinates points of the N pixels, or

(c) N transformed coordinates points obtained by applying a predetermined linear polynomial function to each of N predetermined coordinates points, or

(d) differential values between the N transformed coordinates points obtained by applying the predetermined linear polynomial function to N predetermined coordinates points and predicted values. These predicted values represent the N predetermined coordinates points, or the N transformed coordinates points of the previous frame.
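Cases (b) and (d) above transmit differences rather than absolute coordinates. A minimal sketch of that idea follows; the function names `encode_differentials` and `decode_differentials` and the sample points are hypothetical. The predicted values may be the original pixel coordinates or the transformed points of the previous frame, as the text describes.

```python
def encode_differentials(transformed, predicted):
    """Differences between transformed points and their predicted values."""
    return [(u - pu, v - pv) for (u, v), (pu, pv) in zip(transformed, predicted)]


def decode_differentials(differentials, predicted):
    """Recover the transformed points by adding the predicted values back."""
    return [(du + pu, dv + pv) for (du, dv), (pu, pv) in zip(differentials, predicted)]


# Three pixel coordinates points and their transformed positions.
points = [(0, 0), (16, 0), (0, 16)]
transformed = [(3, -4), (27, -12), (7, 28)]

# Here the predicted values are the original points themselves.
differentials = encode_differentials(transformed, points)
recovered = decode_differentials(differentials, points)
```

When the deformation between frames is small, the differences tend to be small in magnitude, which is what makes them cheaper to encode than absolute coordinates.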

A digital picture encoder according to the present invention comprises the following elements: transformation parameter estimator, predicted picture generator, first adder, differential picture compressor, differential picture expander, second adder, frame memory and transmitter.

The above digital picture encoder performs the following steps: first, input a digital picture, second, in the transformation parameter estimator, estimate each of the transformation parameters using the picture stored in the frame memory and the digital picture, third, input the estimated transformation parameters together with the picture stored in the frame memory to the predicted picture generator, next, produce a predicted picture based on the estimated transformation parameters, then in the first adder, find a difference between the digital picture and the predicted picture, after that, in the differential picture compressor, compress the difference into compressed differential data, then transmit the data to the transmitter, at the same time, in the differential picture expander, expand the compressed differential data into expanded differential data, then, in the second adder, the predicted picture is added to the expanded differential data, next, store the added result in the frame memory. To be more specific, the coordinates data is transmitted from the transformation parameter estimator to the transmitter, and it is transmitted together with the compressed differential data.

The above coordinates data comprises one of the following cases:

(a) the coordinates points of N pixels and the corresponding N transformed coordinates points obtained by applying transformation using the transformation parameters, or

(b) the coordinates points of N pixels as well as each of the differential values between the coordinates points of the N pixels and the N transformed coordinates points, or

(c) N coordinates points transformed from each of the N predetermined coordinates points of pixels, or

(d) each of the differential values between the N coordinates points transformed from the N predetermined coordinates points of pixels, or

(e) each of the differential values between the N transformed coordinates points and those of a previous frame.

A digital picture decoder according to the present invention comprises the following elements: variable length decoder, differential picture expander, adder, transformation parameter generator, predicted picture generator and frame memory.

The above digital picture decoder performs the following steps: first, input data to the variable length decoder, second, separate differential picture data and transmit it to the differential picture expander, at the same time, input the number of coordinates data together with the coordinates data to the transformation parameter generator, thirdly, in the differential picture expander, expand the differential picture data, and transmit it to the adder, next, in the transformation parameter generator, change transformation parameter generation methods depending on the number of the transformation parameters, then, produce the transformation parameters from the coordinates data, and transmit them to the predicted picture generator, then, in the predicted picture generator, produce the predicted picture using the transformation parameters and the picture input from the frame memory, and transmit the predicted picture to the adder, where the predicted picture is added to the expanded differential picture, finally, produce the picture to output, at the same time, store the picture in the frame memory.

The above coordinates data represent one of the following cases:

(a) the coordinates points of N pixels and the corresponding N transformed coordinates points obtained by transforming the coordinates points of the N pixels by using the predetermined linear polynomial function, or

(b) the coordinates points of N pixels and each of the differential values between the coordinates points of the N pixels and the corresponding N transformed coordinates points obtained by transforming the coordinates points of the N pixels by using the predetermined linear polynomial function, or

(c) the N coordinates points transformed from the N predetermined coordinates points by the predetermined linear polynomial function, or

(d) differential values between the coordinates points of N pixels and the coordinates points of N pixels of the previous frame, and differential values between the N transformed coordinates points obtained by the predetermined linear polynomial function and the N transformed coordinates points in the previous frame, or

(e) N coordinates points transformed from the N predetermined coordinates points by the predetermined linear polynomial function, or

(f) differential values between the N coordinates points transformed from the N predetermined coordinates points by the predetermined linear polynomial function and the N predetermined coordinates points, or

(g) differential values between the N coordinates points transformed from the N predetermined coordinates points by the predetermined linear polynomial function and those in the previous frame.

When the transformation parameters are transmitted, either the transformation parameters are multiplied by the picture size and then quantized before being encoded, or an exponent of the maximum value of the transformation parameters is found, the parameters are normalized by the exponent, and then the normalized transformation parameters are transmitted together with the exponent.
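The second alternative above (normalizing by an exponent) can be sketched as follows. The function names, the use of a base-2 exponent taken from `math.frexp`, and the 12-bit mantissa width are assumptions made for illustration and are not specified by the text.

```python
import math


def normalize_parameters(params, bits=12):
    """Return a shared base-2 exponent and integer mantissas for the parameters."""
    largest = max(abs(p) for p in params)
    # frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
    exponent = math.frexp(largest)[1] if largest else 0
    scale = 1 << bits
    mantissas = [round(p / 2.0 ** exponent * scale) for p in params]
    return exponent, mantissas


def denormalize_parameters(exponent, mantissas, bits=12):
    """Rebuild approximate parameter values from the exponent and mantissas."""
    scale = 1 << bits
    return [m / scale * 2.0 ** exponent for m in mantissas]


# Hypothetical Affine parameters (a, b, c, d, e, f).
params = [1.5, 0.25, -0.5, 2.0, 3.0, -4.0]
exponent, mantissas = normalize_parameters(params)
restored = denormalize_parameters(exponent, mantissas)
```

Because one exponent is shared by all parameters, the mantissas can use the same fixed width regardless of the magnitude of the largest parameter.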

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram depicting a predicted picture encoder according to the present invention.

FIG. 2 is a first schematic diagram depicting a coordinates transform used in the first and second exemplary embodiments of the present invention.

FIG. 3 is a bit stream depicting encoded picture data by a predicted picture encoder used in the first exemplary embodiment of the present invention.

FIG. 4 is a second schematic diagram depicting a coordinates transform used in the first and second exemplary embodiments.

FIG. 5 is a block diagram depicting a predicted picture decoder used in the second exemplary embodiment of the present invention.

FIG. 6 is a schematic diagram depicting a resolved picture in the first and second exemplary embodiments.

FIG. 7 is a schematic diagram depicting a picture resolved by a conventional method.

FIG. 8 is a block diagram depicting a digital picture decoder used in the third exemplary embodiment.

FIG. 9 is a block diagram depicting a digital picture encoder used in the third exemplary embodiment.

FIG. 10 is a block diagram depicting a digital picture decoder used in the fourth exemplary embodiment.

FIG. 11 is a block diagram depicting a digital picture decoder used in the fifth exemplary embodiment.

FIG. 12 is a block diagram depicting a digital picture encoder used in the fifth exemplary embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The exemplary embodiments of the present invention are detailed hereinafter by referring to FIGS. 1-12.

(Embodiment 1)

FIG. 1 is a block diagram depicting a predicted picture encoder according to the present invention. FIG. 1 lists the following elements: input terminal 101, first adder 102, encoder 103, output terminal 106, decoder 107, second adder 110, first coordinates transformer 111, second coordinates transformer 112, motion detector 113, motion compensator 114, and frame memory 115.

The predicted picture encoder having the above structure operates as follows:

(a) Input the target pictures numbered 1-N and having respective different sizes into the input terminal 101, where N is determined depending on the video length. First of all, input the first target picture to the input terminal 101; via the first adder 102, the first target picture is compressed in the encoder 103. In this case, the first adder 102 does not perform a subtraction. In this exemplary embodiment, the target picture is resolved into a plurality of adjoining blocks (8×8 pixels), and a signal in the spatial domain is transformed into the frequency domain to form a transformed block by discrete cosine transform (DCT) 104. The transformed block is quantized by a quantizer 105 to form a first compressed picture, which is output to the output terminal 106. This output is converted into fixed length codes or variable length codes and then transmitted (not shown). At the same time, the first compressed picture is restored into an expanded picture by the decoder 107.

(b) In this exemplary embodiment, the first compressed picture undergoes an inverse quantizer (IQ) 108 and an inverse discrete cosine transformer (IDCT) 109 to be transformed eventually to the spatial domain. A reproduced picture thus obtained undergoes the first coordinates transformer 111 and is stored in the frame memory 115 as a first reproduced picture.

(c) The first coordinates transformer 111 is detailed here. FIG. 2A is used as the first target picture. A pixel “Pa” of a picture 201 has a coordinates point (0, 0) in the coordinates system 203. Another coordinates system 205 is established in FIG. 2C, which may be the coordinates system of a display window or that of the target picture whose center is the origin of the coordinates system. In either event, the coordinates system 205 should be established before encoding is started. FIG. 2C shows a mapping of the target picture 201 in the coordinates system 205. The pixel “Pa” of the target picture 201 is transformed into (x_a, y_a) due to this coordinates transform. The coordinates transform sometimes includes a rotation. The values of x_a and y_a are encoded as fixed-length 8-bit codes and then transmitted with the first compressed picture.

(d) Input the “n”th (n=2, 3, . . . , N) target picture to the input terminal 101. Input the “n”th target picture into the second coordinates transformer 112 via a line 126, and transform it into the coordinates system 205. A picture 202 in FIG. 2B is used as the “n”th target picture. Map this target picture in the coordinates system 205, and transform the coordinates point of pixel “b1” into (x_b, y_b) as shown in FIG. 2C. Then, input the target picture 202 which has undergone the coordinates transform into the motion detector 113, and resolve it into a plurality of blocks, then detect a motion using a block matching method or others by referring to the “n−1”th reproduced picture, thereby producing a motion vector. Next, output this motion vector to a line 128, and encode it to transmit (not shown), at the same time, send it to the motion compensator 114, then, produce a predicted block by accessing the “n−1”th reproduced picture stored in the frame memory 115. Examples of the motion detection and motion compensation are disclosed in U.S. Pat. No. 5,193,004.

(e) Input the blocks of the “n”th target picture and the predicted blocks thereof into the first adder 102, and produce the differential blocks. Next, compress the differential blocks in the encoder 103, then produce the “n”th compressed picture and output it to the output terminal 106, at the same time, restore it to an expanded differential block in the decoder 107. Then, in the second adder 110, add the predicted block sent through a line 125 to the expanded differential block, thereby reproducing the picture. Input the picture thus reproduced to the first coordinates transformer 111, and apply the coordinates transform to the picture in the same manner as the picture 202 in FIG. 2C, and store it in the frame memory 115 as the “n”th reproduced picture, at the same time, encode the coordinates point (x_b, y_b) of the pixel “b1”, and transmit this encoded data together with the “n”th compressed picture.

(f) FIG. 3 is a bit stream depicting encoded picture data by a predicted picture encoder used in the exemplary embodiment of the present invention. At the top of the encoded picture data, a picture sync. signal 303 exists, next are the parameters x_a 304 and y_a 305 which have undergone the coordinates transform, then the picture size 306, 307, and a step value 308 used for quantization, after which the compressed data and the motion vector follow. In other words, the parameters x_a 304, y_a 305 and the picture size 306, 307 are transmitted as the coordinates data.

(g) FIG. 4 shows another mode of coordinates transform used in the exemplary embodiments of the present invention. In this case, resolve the target picture into a plurality of regions, and apply the coordinates transform to each region. For instance, resolve the picture 201 into three regions, R1, R2, and R3, then, compress and expand each region, after that, apply the coordinates transform to each reproduced R1, R2 and R3 in the first coordinates transformer 111, then store them in the frame memory 115. Encode parameters (x_a1, y_a1), (x_a2, y_a2), and (x_a3, y_a3) to be used in the coordinates transform simultaneously, and transmit the encoded parameters.

(h) Input the picture 202, and resolve it into regions R4, R5 and R6. Apply the coordinates transform to each region in the second coordinates transformer 112. Each transformed region undergoes the motion detector and motion compensator by referring to the regions stored in the frame memory 115, then produce a predicted signal, and produce a differential signal in the first adder 102, next, compress and expand the differential signal, and add the predicted signal thereto in the second adder. Each region thus reproduced undergoes the coordinates transform and is stored in the frame memory 115. Encode parameters (x_b1, y_b1), (x_b2, y_b2), and (x_b3, y_b3) to be used in the coordinates transform simultaneously and transmit them.

Pictures of different sizes are transformed into common spatial coordinates, thereby increasing the accuracy of motion detection and reducing the coded quantity of the motion vector; as a result, picture quality is improved. The coordinates of the pictures in FIGS. 6A and 6B align at point 605, whereby motion can be correctly detected because the blocks 601 and 603 are identical, and 602 and 604 are identical. Further in this case, the motion vectors of blocks 603 and 604 are nearly zero, thereby reducing the coded quantity of the motion vector. In general, the same manner is applicable to two adjoining pictures. As opposed to FIG. 7B, since the face drawn in the block 603 in FIG. 6B is contained within one block, a vertical distortion resulting from quantization does not appear on the face.
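The header order described in step (f) (picture sync. signal, x_a, y_a, picture size, quantization step value) can be sketched with a fixed-layout header. Only the field order and the 8-bit fixed length of x_a and y_a come from the text; the sync value `SYNC`, the signed interpretation of x_a and y_a, the 16-bit picture size fields, the 8-bit step value, and the function names are assumptions made for illustration.

```python
import struct

# Big-endian, no padding: sync (32 bits), x_a and y_a (signed 8 bits each),
# picture width and height (16 bits each), quantization step (8 bits).
HEADER = struct.Struct(">IbbHHB")
SYNC = 0x00000100  # hypothetical sync pattern, not taken from the text


def pack_header(x_a, y_a, width, height, step):
    """Serialize the coordinates data and step value behind the sync signal."""
    return HEADER.pack(SYNC, x_a, y_a, width, height, step)


def unpack_header(data):
    """Parse a header produced by pack_header and check the sync signal."""
    sync, x_a, y_a, width, height, step = HEADER.unpack_from(data)
    if sync != SYNC:
        raise ValueError("picture sync signal not found")
    return {"x_a": x_a, "y_a": y_a, "width": width, "height": height, "step": step}


header = pack_header(-12, 20, 176, 144, 8)
fields = unpack_header(header)
```

In a real stream the compressed data and the motion vectors would follow these bytes, as step (f) describes.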

(Embodiment 2)

FIG. 5 is a block diagram depicting a predicted picture decoder used in the second exemplary embodiment of the present invention. FIG. 5 lists the following elements: input terminal 501, data analyzer 502, decoder 503, adder 506, output terminal 507, coordinates transformer 508, motion compensator 509, and frame memory 510.

An operation of the predicted picture decoder comprising the above elements is described here. First, to the input terminal 501, input compressed picture data numbered 1 through N, including an “n”th transformation parameter which is produced by encoding target pictures having respective different sizes and numbered 1 through N and transforming the “n”th (n=1, 2, 3, . . . , N) target picture into common spatial coordinates. FIG. 3 is a bit stream depicting an example of compressed picture data. Second, analyze the input compressed picture data by the data analyzer 502.

Analyze the first compressed picture data by the data analyzer 502, and then, output the first compressed picture to the decoder 503. Send the first transformation parameters (x_a, y_a, as shown in FIG. 2C), which are produced by transforming the first picture into the common spatial coordinates, to the coordinates transformer 508. In the decoder 503, decode the first compressed picture to an expanded picture, and then output it to the output terminal 507, at the same time, input the expanded picture to the coordinates transformer 508. In this second embodiment, the expanded picture undergoes an inverse quantization and IDCT before being restored to a signal of the spatial domain. In the coordinates transformer 508, map the expanded picture in the common spatial coordinates system based on the first transformation parameter, and then, output it as a first reproduced picture, and store this in the frame memory 510. Regarding the coordinates transform, the same method as in the first embodiment is applied to this second embodiment.

Next, analyze the “n”th (n=2, 3, 4, . . . , N) compressed picture data by the data analyzer 502, and output the “n”th differential compressed picture to the decoder 503. Send the “n”th motion data to the motion compensator 509 via a line 521. Then, send the “n”th transformation parameter (x_b, y_b, as shown in FIG. 2C), which is produced by transforming the “n”th picture into the common spatial coordinates, to the coordinates transformer 508 and the motion compensator 509 via a line 520. In the decoder 503, restore the “n”th differential compressed picture to the “n”th expanded differential picture, and output this to the adder 506. In this second embodiment, a differential signal of the target block undergoes the inverse quantization and IDCT, and is output as an expanded differential block. In the motion compensator 509, a predicted block is obtained from the frame memory 510 using the “n”th transformation parameter and the motion vector of the target block. In this second embodiment, the coordinates of the target block are transformed using the transformation parameter. In other words, add the transformation parameter (e.g., x_b, y_b, as shown in FIG. 2C) to the coordinates of the target block, and add the motion vector to this sum, thereby determining an address in the frame memory 510. Send the predicted block thus obtained to the adder 506, where it is added to the expanded differential block, thereby reproducing the picture. Then, output the reproduced picture to the output terminal 507; at the same time, the reproduced picture undergoes the coordinates transform in the coordinates transformer 508 using the “n”th transformation parameter, and is stored in the frame memory 510. The coordinates transformer 508 can be replaced by the motion compensator 509 or another apparatus which has the following function: before or after processing the target block, add the difference between the parameters of the “n”th picture and the “n−1”th picture, i.e., (x_b−x_a, y_b−y_a), to the coordinates of the target block, and to this sum, add the motion vector. Instead of using the coordinates transformer 508, the address in the frame memory 510 can be determined using one of the above alternatives.
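The address determination described above can be illustrated by a minimal sketch. It assumes that the frame memory is a row-major list of pixel rows and that all quantities are integer pixel positions; the function names `predicted_block_origin` and `fetch_block` and the sample values are hypothetical and are not taken from the embodiment.

```python
def predicted_block_origin(block_xy, offset_xy, motion_vector):
    """Add the transformation parameter (offset) and then the motion vector
    to the target block coordinates, giving a frame memory position."""
    bx, by = block_xy
    ox, oy = offset_xy
    mvx, mvy = motion_vector
    return (bx + ox + mvx, by + oy + mvy)


def fetch_block(frame_memory, origin, size):
    """Copy a size x size block from a row-major frame memory (list of rows)."""
    x0, y0 = origin
    return [row[x0:x0 + size] for row in frame_memory[y0:y0 + size]]


# Hypothetical 8x8 frame memory whose pixel value encodes its own position.
frame_memory = [[10 * y + x for x in range(8)] for y in range(8)]

# Block at (0, 0) of the target picture, offset (x_b, y_b) = (2, 1),
# motion vector (1, 2): the predicted block starts at (3, 3).
origin = predicted_block_origin(block_xy=(0, 0), offset_xy=(2, 1), motion_vector=(1, 2))
predicted_block = fetch_block(frame_memory, origin, size=2)
```

Because the offset is added before the motion vector, pictures of different sizes share one address space in the frame memory, and the motion vector itself can stay small.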

A case where another compressed picture data is input to the input terminal 501 is discussed hereinafter: input compressed picture data numbered 1 through N including transformation parameters which can be produced by resolving the target pictures numbered 1 through N, having respective different sizes, into a respective plurality of regions, encoding each region, and then transforming the respective regions into the common spatial coordinates.

First, analyze the first compressed picture data in the data analyzer 502, and output the “m”th (m=1, 2, . . . , M) compressed region to the decoder 503. In FIG. 4A, this is exemplified by M=3. Then, send the “m”th transformation parameter (x_am, y_am, as shown in FIG. 4A), which is produced by transforming the “m”th compressed region into the common spatial coordinates, to the coordinates transformer 508 via a line 520. In the decoder 503, restore the “m”th compressed region to the “m”th expanded region, and then output this to the output terminal 507; at the same time, input the “m”th expanded region to the coordinates transformer 508. Map the “m”th expanded region in the common spatial coordinates system based on the “m”th transformation parameter, output this as the “m”th reproduced region, and finally store the reproduced region in the frame memory 510. The method is the same as the previous one.

Second, analyze the “n”th (n=1, 2, 3, . . . , N) compressed picture data in the data analyzer 502, and output the “k”th (k=1, 2, . . . , K) differential compressed region in the data to the decoder 503. In FIG. 4B, this is exemplified by K=3. Also send the corresponding motion data to the motion compensator 509 via a line 521, then transform the data into the common spatial coordinates, thereby producing the “k”th transformation parameter (x_bk, y_bk, k=1, 2, 3 in FIG. 4B). Send this parameter to the coordinates transformer 508 and the motion compensator 509 via the line 520. In the decoder 503, restore the “k”th differential compressed region to an expanded differential region, and then output it to the adder 506. In this second embodiment, the differential signal of the target block undergoes an inverse quantization and IDCT before being output as an expanded differential block. In the motion compensator 509, a predicted block is obtained from the frame memory 510 using the “k”th transformation parameter and the motion vector of the target block. In this second embodiment, the coordinates of the target block are transformed using the “k”th transformation parameter. In other words, add the transformation parameter (e.g., x_bk, y_bk, as shown in FIG. 4B) to the coordinates of the target block, and add the motion vector to this sum, thereby determining an address in the frame memory 510. Send the predicted block thus obtained to the adder 506, where it is added to the expanded differential block, thereby reproducing the picture. Then, output the reproduced picture to the output terminal 507; at the same time, the reproduced picture undergoes the coordinates transform in the coordinates transformer 508, and is stored in the frame memory 510.

(Embodiment 3)

FIG. 8 is a block diagram depicting a decoder utilized in this third exemplary embodiment. The decoder comprises the following elements: input terminal 801, variable length decoding part 802, differential picture expanding part 803, adding part 804, output terminal 805, transformation parameter producing part 806, frame memory 807 and predicted picture producing part 808.

First, input compressed picture data to the input terminal 801. Second, in the variable length decoding part 802, analyze the input data and separate differential picture data as well as coordinates data from the input data. Third, send these separated data to the differential picture expanding part 803 and the transformation parameter producing part 806 via lines 8002 and 8003 respectively. The differential picture data includes quantized transformed (DCT) coefficients and a quantization stepsize (scale). In the differential picture expanding part 803, apply an inverse quantization to the transformed DCT coefficients using the quantization stepsize, and then apply an inverse DCT thereto for expanding to the differential picture.
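The expansion performed in the differential picture expanding part 803 can be sketched as follows. This is only an illustration under stated assumptions: a single uniform quantization stepsize for the whole block, an orthonormal DCT, and a direct (slow) evaluation of the inverse transform. Actual coding standards define their own scaling and quantization matrices, and the names `dequantize` and `idct_2d` are hypothetical.

```python
import math


def dequantize(levels, step):
    """Inverse quantization: scale each quantized level by the stepsize."""
    return [[level * step for level in row] for row in levels]


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT of a square block (direct O(n^4) form)."""
    n = len(coeffs)

    def scale(k):
        # Normalization factor of the orthonormal DCT basis.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    block = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            total = 0.0
            for v in range(n):
                for u in range(n):
                    total += (scale(u) * scale(v) * coeffs[v][u]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            block[y][x] = total
    return block


# A 4x4 block whose only non-zero quantized coefficient is the DC term.
levels = [[0] * 4 for _ in range(4)]
levels[0][0] = 2
differential_block = idct_2d(dequantize(levels, step=8))
```

With only the DC coefficient present, every pixel of the expanded differential block takes the same value, which makes the example easy to check by hand.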

The coordinates data include the data for producing transformation parameters, and the transformation parameters are produced by the transformation parameter producing part 806; e.g., in the case of the Affine transform expressed by the equation (3), parameters a, b, c, d, e, and f are produced, which is detailed hereinafter.

Input the transformation parameters produced by the transformation parameter producing part 806 and the picture stored in the frame memory 807 into the predicted picture producing part 808. In the case of the Affine transform expressed by the equation (3), the predicted value for a pixel at (x, y) is given by a pixel at (u, v) of the image stored in the frame memory according to equation (3) using the transformation parameters (a, b, c, d, e, f) sent from the transformation parameter producing part 806. The same practice is applicable to the equations (1), (2), and (4).
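A minimal sketch of this prediction step is given below. It assumes the form (u, v) = (ax + by + e, cx + dy + f), which is the form used in the simultaneous equations (5), rounds (u, v) to the nearest pixel, and fills positions that fall outside the stored picture with a constant. The rounding rule, the fill value, and the name `predict_affine` are assumptions made for illustration; the text does not specify them.

```python
def predict_affine(reference, params, width, height, fill=0):
    """Build a predicted picture: the pixel at (x, y) is taken from the
    reference picture at (u, v) = (a*x + b*y + e, c*x + d*y + f)."""
    a, b, c, d, e, f = params
    ref_height = len(reference)
    ref_width = len(reference[0])
    predicted = []
    for y in range(height):
        row = []
        for x in range(width):
            u = int(round(a * x + b * y + e))  # nearest-pixel sampling (assumption)
            v = int(round(c * x + d * y + f))
            if 0 <= u < ref_width and 0 <= v < ref_height:
                row.append(reference[v][u])
            else:
                row.append(fill)  # outside the stored picture (assumption)
        predicted.append(row)
    return predicted


# Hypothetical 4x4 reference picture whose pixel value encodes its position.
reference = [[10 * y + x for x in range(4)] for y in range(4)]

# Pure translation: a = d = 1, b = c = 0, (e, f) = (1, 2).
predicted = predict_affine(reference, (1, 0, 0, 1, 1, 2), width=2, height=2)
```

A practical implementation would normally interpolate between neighbouring pixels when (u, v) is not an integer position; nearest-pixel sampling is used here only to keep the sketch short.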

Send the predicted picture thus obtained to the adding part 804, where the differential picture is added thereto, thereby reproducing the picture. Output the reproduced picture to the output terminal 805, and at the same time, store the reproduced picture in the frame memory 807.

The coordinates data described above can take a plurality of forms, which are discussed here.

Hereinafter the following case is discussed: the coordinates data comprise the coordinates points of “N” pieces of pixels, and the “N” pieces of coordinates points transformed by the predetermined linear polynomial, where “N” represents the number of points required for finding the transformation parameters. In the case of the Affine parameter, there are six parameters, thus six equations are needed to solve six variables. Since one coordinates point has (x, y) components, six Affine parameters can be solved in the case of N=3. N=1, N=2 and N=5 are applicable to the equations (1), (2) and (4) respectively. The “N” pieces of transformed coordinates points are motion vectors and correspond to the (u, v) components on the left side of equation (4).

In the case of the Affine transform, three coordinates points, i.e., (x0, y0), (x1, y1) and (x2, y2), and three transformed coordinates points, i.e., (u0, v0), (u1, v1) and (u2, v2), are input into the transformation parameter producing part 806 via a line 8003. In the transformation parameter producing part 806, the Affine parameter can be obtained by solving the following simultaneous equations.

(u0, v0)=(ax0+by0+e, cx0+dy0+f)

(u1,v1)=(ax1+by1+e, cx1+dy1+f)  (5)

(u2, v2)=(ax2+by2+e, cx2+dy2+f)

The transformation parameters can also be obtained using more coordinates data. For the other cases, given by equations (1), (2) and (4), the transformation parameters can be solved in the same manner. To obtain the transformation parameters at high accuracy, the N coordinates points (x, y) have to be appropriately chosen. Preferably, the N points are located perpendicular to each other.
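The simultaneous equations (5) split into two independent 3x3 linear systems, one for (a, b, e) and one for (c, d, f). The following sketch solves them with Cramer's rule; the choice of Cramer's rule, the function names, and the sample points are assumptions made for illustration and are not prescribed by the embodiment.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve m * s = rhs for a 3x3 system using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("coordinates points are collinear; no unique solution")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def affine_from_points(points, transformed):
    """Return (a, b, c, d, e, f) such that, for each point pair,
    (u, v) = (a*x + b*y + e, c*x + d*y + f)."""
    m = [[float(x), float(y), 1.0] for x, y in points]
    a, b, e = solve3(m, [u for u, _ in transformed])
    c, d, f = solve3(m, [v for _, v in transformed])
    return a, b, c, d, e, f


# Three points forming a right angle, and their transformed positions.
points = [(0, 0), (16, 0), (0, 16)]
transformed = [(3.0, -4.0), (27.0, -12.0), (7.0, 28.0)]
params = affine_from_points(points, transformed)
```

Collinear points make the determinant zero, which is one reason the choice of the N points matters for accuracy. When more than three point pairs are available, a least-squares fit could be used instead of an exact solve.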

When the coordinates points (x0, y0), (x1, y1) and (x2, y2) are required for the given transformed coordinates points (u0, v0), (u1, v1) and (u2, v2), the simultaneous equations (6) can be solved instead of the equations (5).

(x0, y0)=(Au0+Bv0+E, Cu0+Dv0+F)

 (x1, y1)=(Au1+Bv1+E, Cu1+Dv1+F)  (6)

(x2, y2)=(Au2+Bv2+E, Cu2+Dv2+F)

Hereinafter the following case is discussed: the coordinates data comprise the coordinates points of “N” pieces of pixels, and differential values of the “N” pieces of coordinates points transformed by the predetermined linear polynomial. When the predicted values for obtaining the difference are the coordinates points of the “N” pieces of pixels, the transformation parameters are produced through the following steps: first, in the transformation parameter producing part 806, add the differential values to the coordinates points of the “N” pieces of pixels to restore the “N” pieces of transformed coordinates points, and then produce the transformation parameters using the coordinates points of the “N” pieces of pixels and the restored “N” pieces of transformed coordinates points. When the predicted values for obtaining the difference are the transformed coordinates points of the “N” pieces of pixels of the previous frame, in the transformation parameter producing part 806, the transformed coordinates points of the “N” pieces of pixels in the previous frame are added to the differential values to restore the N transformed coordinates points of the current frame. The transformation parameters are then calculated from the coordinates points of the “N” pieces of pixels and the restored N transformed coordinates points. The restored N transformed coordinates points are stored as prediction values for the succeeding frames.
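The two differential forms above differ only in which points serve as the prediction. A minimal sketch, assuming integer point coordinates and using the hypothetical name `restore_transformed_points`:

```python
def restore_transformed_points(differences, predictions):
    """Add each transmitted difference to its predicted point."""
    return [(dx + px, dy + py)
            for (dx, dy), (px, py) in zip(differences, predictions)]


# Coordinates points of N = 3 pixels (hypothetical values).
points = [(0, 0), (16, 0), (0, 16)]

# Case 1: the prediction is the pixel coordinates points themselves.
differences_frame1 = [(3, -4), (11, -12), (7, 12)]
transformed_frame1 = restore_transformed_points(differences_frame1, points)

# Case 2: the prediction is the transformed points of the previous frame,
# which the decoder has kept after restoring them.
differences_frame2 = [(1, 0), (0, 1), (-1, -1)]
transformed_frame2 = restore_transformed_points(differences_frame2, transformed_frame1)
```

In the second case the transmitted differences stay small whenever the transformation changes slowly from frame to frame, which is where the reduction in transferred data comes from.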

Next, the following case is discussed here: the coordinates data are the “N” pieces of coordinates points transformed from predetermined “N” pieces of coordinates points by a predetermined linear polynomial. It is not necessary to transmit the “N” pieces of coordinates points because they are predetermined. In the transformation parameter producing part 806, the transformation parameters are produced using the coordinates points of the predetermined “N” pieces of pixels and the transformed coordinates points.

Then the following case is considered: the coordinates data are the differential values of the “N” pieces of transformed coordinates points obtained by applying the predetermined linear polynomial function to the predetermined “N” pieces of coordinates points. In the case where the prediction values for obtaining the difference are the predetermined “N” pieces of coordinates points, in the transformation parameter producing part 806, the predetermined “N” pieces of coordinates points are added to the difference to retrieve the transformed coordinates points. Then the transformation parameters are calculated from the predetermined “N” pieces of coordinates points and the transformed coordinates points thus retrieved. When the predicted values for obtaining the difference are the transformed coordinates points of the “N” pieces of pixels of the previous frame, in the transformation parameter producing part 806, the transformed coordinates points of the “N” pieces of pixels in the previous frame are added to the differential values to retrieve the N transformed coordinates points of the current frame. The transformation parameters are then calculated from the coordinates points of the “N” pieces of pixels and the retrieved N transformed coordinates points. The retrieved N transformed coordinates points are stored as prediction values for the succeeding frames.

FIG. 9 is a block diagram depicting an encoder utilized in the third exemplary embodiment of the present invention. The encoder comprises the following elements: input terminal 901, transformation parameter estimator 903, predicted picture generator 908, first adder 904, differential picture compressor 905, differential picture expander 910, second adder 911, frame memory 909 and transmitter 906. First, input a digital picture to the input terminal 901. Second, in the transformation parameter estimator 903, estimate a transformation parameter using a picture stored in the frame memory and the input digital picture. The estimating method of the Affine parameters has already been described above.

Instead of the picture stored in the frame memory, an original picture thereof can be used. Third, send the estimated transformation parameters to the predicted picture generator 908 via a line 9002, and send the coordinates data transformed by the transformation parameters to the transmitter 906 via a line 9009. The coordinates data can be in a plurality of forms as already discussed. Input the estimated transformation parameters and the picture stored in the frame memory 909 to the predicted picture generator 908, and then produce the predicted picture based on the estimated transformation parameters. Next, in the first adder 904, find a difference between the digital picture and the predicted picture, then compress the difference into differential compressed data in the differential picture compressor 905, and send this to the transmitter 906. In the differential picture compressor 905, apply DCT to the differential data and quantize the result; at the same time, in the differential picture expander 910, the inverse quantization and inverse DCT are applied. In the second adder 911, the expanded differential data is added to the predicted picture, and the result is stored in the frame memory 909. In the transmitter 906, encode the differential compressed data, the quantization width and the coordinates data, then multiplex them, and transmit or store them.
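The role of the differential picture expander 910 and the second adder 911 is to keep in the encoder's frame memory the same reproduced picture that the decoder will obtain. The sketch below shows that loop in a deliberately simplified form: the DCT is omitted and the difference is quantized directly in the pixel domain with a uniform stepsize, so it is not the compressor of the embodiment, only an illustration of the loop. The name `encode_block` and all values are hypothetical.

```python
def encode_block(target, predicted, step):
    """Quantize the prediction difference and rebuild the reproduced block
    exactly as a decoder would (DCT omitted for brevity)."""
    # First adder plus compressor: difference, then uniform quantization.
    levels = [[int(round((t - p) / step)) for t, p in zip(t_row, p_row)]
              for t_row, p_row in zip(target, predicted)]
    # Expander plus second adder: inverse quantization, then add prediction.
    reproduced = [[p + q * step for p, q in zip(p_row, q_row)]
                  for p_row, q_row in zip(predicted, levels)]
    return levels, reproduced


target = [[101, 104], [97, 91]]
predicted = [[98, 100], [100, 100]]
levels, reproduced = encode_block(target, predicted, step=4)
```

The reproduced block differs slightly from the target because of quantization. Storing the reproduced block rather than the target keeps the encoder and decoder predictions identical.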

(Embodiment 4)

FIG. 10 depicts a digital picture decoder utilized in a fourth exemplary embodiment. The decoder comprises the following elements: input terminal 1001, variable length decoder 1002, differential picture expander 1003, adder 1004, transformation parameter generator 1006 and frame memory 1007. Since the basic operation is the same as that described in FIG. 8, only the different points are explained here. The transformation parameter generator 1006 can produce plural types of parameters. A parameter producing section 1006 a comprises means for producing the parameters (a, e, d, f) expressed by the equation (2), a parameter producing section 1006 b comprises means for producing the parameters (a, b, e, c, d, f) expressed by the equation (3), and a parameter producing section 1006 c comprises means for producing the parameters (g, p, r, a, b, e, h, q, s, c, d, f) expressed by the equation (4). The equations (2), (3) and (4) require two coordinates points, six coordinates points and 12 coordinates points respectively for producing parameters. These numbers of coordinates points control switches 1009 and 1010 via a line 10010. When the number of coordinates points is two, the switches 1009 and 1010 are coupled with terminals 1011 a and 1012 a respectively, the coordinates data is sent to the parameter producing section 1006 a via a line 10003, and simultaneous equations are solved, thereby producing the parameters expressed by the equation (2); the parameters are output from the terminal 1012 a. When the number of coordinates points is three or six, the parameter producing sections 1006 b and 1006 c are coupled to terminals 1011 b, 1012 b and terminals 1011 c, 1012 c respectively. According to the information about the number of coordinates points, the type of coordinates data to be transmitted can be identified, whereby the transformation parameters can be produced responsive to the number. The form of the coordinates data that runs through the line 10003 has already been discussed. When the right sides of the equations (2)-(4), i.e., (x, y), are known quantities, it is not necessary to transmit these values; therefore, the number of coordinates points running through the line 10010 can be one for the equation (2), three for (3) and six for (4). Further, the number of transformation parameter producing sections is not limited to three but can be more than three.
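The switching by the number of coordinates points can be pictured as a simple lookup. The sketch below follows the switch description above, in which two, three and six points select the sections 1006 a, 1006 b and 1006 c; since each point contributes two equations, these counts match the 4, 6 and 12 parameters listed for equations (2), (3) and (4). The table layout and the name `select_section` are hypothetical.

```python
# Number of coordinates points -> (section, parameters that section produces).
SECTIONS = {
    2: ("1006a", ("a", "e", "d", "f")),                       # equation (2)
    3: ("1006b", ("a", "b", "e", "c", "d", "f")),             # equation (3)
    6: ("1006c", ("g", "p", "r", "a", "b", "e",
                  "h", "q", "s", "c", "d", "f")),             # equation (4)
}


def select_section(num_points):
    """Pick the parameter producing section from the signalled point count."""
    if num_points not in SECTIONS:
        raise ValueError(f"no parameter producing section for {num_points} points")
    return SECTIONS[num_points]


section, parameter_names = select_section(3)

# Each point supplies one u equation and one v equation, so N points
# determine exactly 2 * N parameters.
counts_match = all(len(names) == 2 * n for n, (_, names) in SECTIONS.items())
```

Adding a further transform type would amount to adding one more entry to the table, which mirrors the remark that the sections are not limited to three.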

(Embodiment 5)

FIG. 11 and FIG. 12 are block diagrams depicting a digital picture decoder and encoder respectively. These drawings are basically the same as FIGS. 8 and 9, and yet there are some different points as follows: instead of the transformation parameter producing part 806, a transformation parameter expander 1106 is employed, and an operation of a parameter estimator 1203 is different from that of the parameter estimator 903. These different points are discussed here. In the transformation parameter estimator 1203 of FIG. 12, first, estimate the transformation parameter and multiply it by a picture size; second, quantize the multiplied transformation parameter, and send it to the transmitter 1206 via a line 12009. The transformation parameter is a real number, which should be rounded to an integer after being multiplied. In the case of the Affine parameter, the parameters (a, b, c, d) should be expressed with a high accuracy. Parameters of vertical coordinates “a” and “c” are multiplied by a number of pixels “V” in the vertical direction, and parameters of horizontal coordinates “b” and “d” are multiplied by a number of pixels “H” in the horizontal direction. In the case of equation (4) having a square exponent term, the picture size for multiplying can be squared (H², V², HV). In the transformation parameter expander 1106 of FIG. 11, the multiplied parameter is divided, and the parameter is reproduced. Alternatively, in the transformation parameter estimator 1203 of FIG. 12, estimate the transformation parameters, and then find the maximum value of the transformation parameters. An absolute maximum value is preferable. The transformation parameters are normalized by an exponent part of the maximum value (preferably an exponent part of a power of two), i.e., each transformation parameter is multiplied by a value of the exponent part. Send the transformation parameters thus normalized and the exponent to the transmitter 1206, and transform them into a fixed length code before transmitting. In the transformation parameter expander 1106 of FIG. 11, divide the normalized parameters by the exponent, and expand these to the transformation parameters. In the case of the Affine parameters (a, b, c, d), find the maximum value among (a, b, c, d). In this case, the parameters of parallel translation (e, f) can be included; however, since these parameters typically have a different number of digits from the Affine parameters, they are preferably not included. The same practice can be applied to the parameters of equation (4), and it is preferable to normalize a square exponent (second order) term and a plain (first order) term independently, but it is not limited to this procedure.
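Both ways of sending the real-valued parameters as integers can be sketched briefly. The first pair of functions follows the picture-size scaling as worded above (a and c by V, b and d by H). The second pair is one possible reading of the exponent normalization: the base-2 exponent of the largest absolute parameter fixes a common scale so that every parameter fits in a chosen number of bits. The bit width `mantissa_bits`, the use of `math.frexp`, and all function names are assumptions made for illustration.

```python
import math


def quantize_by_picture_size(a, b, c, d, H, V):
    """Multiply by the picture size and round to integers for transmission."""
    return (round(a * V), round(b * H), round(c * V), round(d * H))


def expand_by_picture_size(quantized, H, V):
    """Decoder side: divide by the same picture size."""
    qa, qb, qc, qd = quantized
    return (qa / V, qb / H, qc / V, qd / H)


def normalize_by_exponent(params, mantissa_bits=10):
    """Scale all parameters by a power of two derived from the largest
    absolute value; return the integer values and the exponent."""
    largest = max(abs(p) for p in params)
    _, exponent = math.frexp(largest)  # largest = m * 2**exponent, 0.5 <= m < 1
    scale = 2.0 ** (mantissa_bits - exponent)
    return [round(p * scale) for p in params], exponent


def expand_by_exponent(normalized, exponent, mantissa_bits=10):
    """Decoder side: undo the power-of-two scaling."""
    scale = 2.0 ** (mantissa_bits - exponent)
    return [n / scale for n in normalized]


H, V = 352, 288
quantized = quantize_by_picture_size(1.0031, -0.0127, 0.0094, 0.9978, H, V)
restored = expand_by_picture_size(quantized, H, V)

normalized, exponent = normalize_by_exponent([1.5, 0.25, -0.5, 1.0])
expanded = expand_by_exponent(normalized, exponent)
```

With picture-size scaling the rounding error of each restored parameter is at most half of 1/V or 1/H, so the precision grows with the picture, which is the "responsive accuracy" the scheme aims at.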

In all the above exemplary embodiments, the descriptions cover the cases where a differential picture is non-zero; however, when the differential picture is perfectly zero, the same procedure is applicable. In this case, a predicted picture is output as it is. Also, the descriptions cover the transformation of an entire picture; however, the same description is applicable to a case where a two-dimensional or three-dimensional picture is resolved into plural small regions, and one of the transforms including the Affine transform is applied to each small region.

Industrial Applicability

According to the present invention as described in the above embodiments, pictures of different sizes are transformed into the same coordinates system, motions thereof are detected, and thus a predicted picture is produced, thereby increasing the accuracy of motion detection and, at the same time, decreasing the coded quantity of motion vectors. On the decoder side, a transformation parameter is obtained from coordinates data, which results in producing a highly accurate transformation parameter and a highly accurate predicted picture. Further, normalizing the transformation parameter as well as multiplying it by a picture size can realize transmission of the parameter with an accuracy responsive to the picture. Also, the transformation parameter can be produced responsive to the number of coordinates data, which can realize an optimal process of producing the transformation parameter and an efficient transmission of the coordinates data.

What is claimed is:
 1. A predicted decoding method for decoding a bitstream data obtained by encoding a target image to be encoded referring to a reference image, wherein a size or a location of the reference image is not as same as one of the target image to be encoded, a location of the target image on a first coordinate system is defined on a common spatial coordinate system, the reference image is located at the common spatial coordinate system, the predicted decoding method comprising: (a) extracting an offset signal and a compressed image signal related to a target image to be decoded from the bitstream data, wherein the offset signal indicates relationship between an origin of the common spatial coordinate system and an origin of the first coordinate system which is a spatial position in a top and left corner of the target image to be decoded; (b) obtaining a predicted image from the reference image referring to the offset signal; (c) producing a decompressed differential image on the first coordinate system, by decoding the compressed image data through a process of inverse quantization and inverse DCT; (d) reproducing a reproduced image on the common spatial coordinate system by summing the predicted image and the decompressed differential image, by referring to the offset signal.
 2. A predicted decoding method according to claim 33, the predicted decoding method further comprising: extracting a motion vector data from the bitstream data; decoding the motion vector data in order to reproduce a motion vector; wherein the predicted image is obtained based on the motion vector from the reference image.
 3. A predicted decoding method according to claim 33, wherein a location of the target images to be encoded is different per frame.
 4. A predicted decoding method according to claim 34, wherein a location of the target images to be encoded is different per frame.